ABSTRACT



A technique for isolating faults in a communication network 
is described. The technique can be utilized in high speed
communications networks such as all-optical networks 
(AONs). The technique is distributed, requires only local 
network node information and can localize attacks for a 
variety of network applications. The technique is particu- 
larly well suited to the problem of attack propagation which 
arises in AONs. The technique finds application in a variety 
of network restoration paradigms, including but not limited 
to automatic protection switching and loopback protection 
and provides proper network operation with reduced, or in some
cases no, data loss and bounded delay time regardless of the
location of the attack or the physical span of the network.
Since the technique is distributed, its associated delays
do not depend on the number of nodes in the network. Hence
the technique avoids the computational complexity inherent 
to centralized approaches. It is thus scalable and relatively 
rapid. Furthermore, the delays in attack isolation do not 
depend on the transmission delays in the network. A network 
management system can therefore offer hard upper-bounds 
on the loss of data due to failures or attacks. Fault localiza- 
tion with centralized algorithms depends on transmission 
delays, which are proportional to the distance traversed by 
the data. Since the described techniques for fault localization 
are not dependent on centralized computations, the tech- 
niques are equally applicable to local area networks, met- 
ropolitan area networks, or wide area networks. 

16 Claims, 16 Drawing Sheets 
FAULT ISOLATION FOR COMMUNICATION
NETWORKS FOR ISOLATING THE SOURCE 
OF FAULTS COMPRISING ATTACKS, 
FAILURES, AND OTHER NETWORK 
PROPAGATING ERRORS 

GOVERNMENT RIGHTS 

This invention was made with government support under 
Contract No. F19628-95C-0002 awarded by the Department 
of the Air Force. The Government has certain rights in this 
invention. 

RELATED APPLICATIONS 
Not applicable. 

FIELD OF THE INVENTION 

This invention relates generally to communication net- 
works and more particularly to localizing attacks or failures 
in communications networks. 

BACKGROUND OF THE INVENTION 

As is known in the art, there is a trend to provide 
communication networks which operate with increasing 
information capacity. This trend has led to the use of 
transmission media and components capable of providing 
information over relatively large signal bandwidths. One 
type of transmission media capable of providing such band- 
widths is an optical carrier transmission media such as glass 
fibers which are also referred to as optical fibers or more 
simply fibers. 

As is also known, an all-optical network (AON) refers to 
a network which does not contain electronic processing 
components. AONs utilize all-optical switching components 
which afford network functionality and all-optical amplifi- 
cation components which counteract attenuation of the opti- 
cal signals through the network. Since AONs do not contain 
electronic processing components, AONs avoid network 
bottlenecks caused by such electronic processing elements. 

Because AONs support delivery of large amounts of 
information, there is a trend to utilize AONs in those 
network applications which require communications rates in 
the range of 1 terabit per second and greater. While network 
architectures and implementations of AONs vary, substan- 
tially all of the architectures and implementations utilize 
devices or components such as optical switches, couplers, 
filters, attenuators, circulators and amplifiers. These building 
block devices are coupled together in particular ways to 
provide the AONs having particular characteristics. 

The devices which perform switching and amplification
of optical signals have certain drawbacks. In particular,
owing to imperfections and necessary physical tolerances
associated with fabricating practical components, the
components allow so-called "leakage signals" to propagate
between signal ports and signal paths of the devices. Ideally,
device signal paths are isolated from each other. Such
leakage signals are often referred to as "crosstalk signals"
and components which exhibit such leakage characteristics
are said to have a "crosstalk" characteristic.

The limitations in the isolation due to the physical prop- 
erties of switches and amplifiers can be exploited by a 
nefarious user. In particular, a nefarious user on one signal 
channel can affect or attack other signal channels having 
signal paths or routes which share devices with the nefarious 
user's channel. Since signals flow unchecked through the 
AON, the nefarious user may use a legitimate means of 
accessing the network to effect a service disruption attack,
causing a quality of service degradation or outright service
denial. The limitations in the operating characteristics of
optical components in AONs thus have important security
ramifications.

One important security issue for optical networks is that
service disruption attacks can propagate through a network.
Propagation of attacks results in the occurrence of failures in
portions of the network beyond where the attack originated.
This is in contrast to failure due to component fatigue.
Failures due to component fatigue generally will not
propagate through the network but will affect a limited number of
nodes and components in the network. Since the mechanisms
and consequences of a service disruption attack are
different from those of a failure, it is necessary to provide
different responses to attacks and failures. Thus, it is
important to have the ability to differentiate between a failure and
an attack and to have the ability to locate the source of an
attack.

Referring to FIG. 1, an example of an attack which
propagates through a switch 10 and an amplifier 16 is
shown. The switch 10 includes switch ports 10a-10d with a
first switch channel 12a provided between switch ports 10a
and 10c and a second switch channel 12b provided between
switch ports 10b and 10d. The switch 10 has a finite amount
of isolation between the first and second switch channels
12a, 12b. Owing to the finite isolation characteristics of the
switch 10, a portion of a signal propagating along the first
switch channel 12a can be coupled to the second switch
channel 12b through a so-called "leakage" or "crosstalk"
signal path or channel 14. Thus, a crosstalk signal 15
propagates from the first switch channel 12a through the
crosstalk channel 14 to the second switch channel 12b.
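
The effect of finite isolation can be illustrated numerically. The following is a minimal sketch only: the 30 dB isolation figure, the signal powers and the function name are assumptions chosen for illustration and do not appear in the description. It uses the standard decibel relation between input power and leaked power.

```python
def crosstalk_power_mw(input_power_mw: float, isolation_db: float) -> float:
    """Return the power (mW) leaked onto the neighbouring switch channel.

    Standard decibel relation: leaked = input * 10 ** (-isolation_db / 10).
    """
    return input_power_mw * 10 ** (-isolation_db / 10)


# Hypothetical figure: 30 dB of isolation between channels 12a and 12b.
ISOLATION_DB = 30.0

normal_leak = crosstalk_power_mw(1.0, ISOLATION_DB)     # ordinary 1 mW signal
attack_leak = crosstalk_power_mw(1000.0, ISOLATION_DB)  # 1 W attacking signal

print(f"leak from a 1 mW signal: {normal_leak:.4f} mW")
print(f"leak from a 1 W signal:  {attack_leak:.4f} mW")
```

Under these assumed numbers, the crosstalk signal produced by the strong input is as large as an ordinary signal on the neighbouring channel, which is why finite isolation can be exploited.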

The output of the second switch channel 12b is coupled
through switch port 10d to an input port 16a of a
two-channel amplifier 16. The amplifier receives a second
channel 12c at a second amplifier input port 16b. If the crosstalk
signal 15 on channel 12b is provided having a particularly
high signal level, the crosstalk signal 15 propagating in
channel 12b of the amplifier 16 couples power from the
signal propagating on the second amplifier channel 12c
thereby reducing the signal level of the signal propagating
on the channel 12c. This is referred to as a gain competition
attack. It should thus be noted that a signal propagating on
the first channel 12a can be used to affect the third channel
12c, even though the channels 12a and 12c are routed
through distinct components (i.e. channel 12a is routed
through the switch 10 and channel 12c is routed through the
amplifier 16).
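
Gain competition can be sketched with a deliberately simplified amplifier model. The model below is an assumption made for illustration and is not taken from the description: all channels share one gain, and that gain is reduced whenever the total output would exceed a fixed saturation power. The gain, saturation power and input powers are hypothetical.

```python
def amplifier_outputs(inputs_mw, small_signal_gain=100.0, saturation_mw=10.0):
    """Return per-channel output power (mW) for a shared-gain amplifier.

    Simplified model (an assumption): every channel sees the same gain,
    which is scaled down so the total output never exceeds saturation_mw.
    """
    total_in = sum(inputs_mw.values())
    gain = small_signal_gain
    if total_in * gain > saturation_mw:
        gain = saturation_mw / total_in  # channels now compete for the gain
    return {channel: power * gain for channel, power in inputs_mw.items()}


# Normal operation: two weak signals, the amplifier is not saturated.
normal = amplifier_outputs({"12b": 0.01, "12c": 0.01})

# Attack: a strong crosstalk signal rides on channel 12b.
attacked = amplifier_outputs({"12b": 1.01, "12c": 0.01})

print("12c output, normal:  ", normal["12c"], "mW")
print("12c output, attacked:", attacked["12c"], "mW")
```

In this sketch the signal on channel 12c is unchanged at the amplifier input, yet its output level drops by more than a factor of ten because the strong signal on channel 12b takes most of the available output power.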

It should also be noted that in this particular example, the
gain competition attack was executed via a signal inserted
into the channel 12b via the crosstalk channel 14 existent in
the switch 10. Thus, a user with a particularly strong signal
can couple power from the signals of other users without
directly accessing an amplifier component. With this
technique, a nefarious user can disrupt several users who
share amplifiers which receive a gain competition signal
from the nefarious user via a different component
propagating on the channel 12c.

FIG. 2 illustrates one scenario in which it is necessary to
differentiate an attack carried out by the network traffic from a
physical failure and in which it is important to be able to
localize the source of the attack. In FIG. 2, a portion of a
network includes a first network node 17a provided by a first
element which here corresponds to a switch 10 and a second
network node 17b provided by a second element which here
corresponds to a second switch 18. It should be noted that
the nodes 17a, 17b are here shown as switches for purposes
of illustration only and that in other embodiments, the nodes
17a, 17b may be provided from elements other than
switches. In this example, it is assumed that each of the
nodes 17a, 17b guards against jamming attacks by pinpointing
any channel on which is propagating a signal having a
signal level higher than a predetermined threshold level and
then disconnecting the channel on which the high level
signal propagates.

In FIG. 2, the switch 10 includes switch ports 10a-10d
with a first switch channel 12a provided between switch
ports 10a and 10c and a second switch channel 12b provided
between switch ports 10b and 10d. The switch 10 has a finite
amount of isolation between the first and second switch
channels 12a, 12b. Channels 12a, 12b both propagate
through the node 17a, which in this particular example
corresponds to the switch 10, and both channels 12a, 12b
propagate signals having the same carrier signal wavelength.
Owing to the finite isolation characteristics between channels
12a, 12b in the switch 10, a portion of a signal
propagating along the first switch channel 12a can be
coupled to the second switch channel 12b through a
crosstalk channel 14. Thus, the crosstalk signal 15 propagates
from the first switch channel 12a through the crosstalk
channel 14 to the second switch channel 12b.

If an excessively powerful signal (e.g. one having a signal
level equal to or greater than the predetermined threshold
level) is introduced via switch port 10a onto channel 12a,
then channel 12a will be disconnected. The crosstalk signal
15, however, from channel 12a is superimposed upon channel
12b at node 17a. If the carrier signals on the two
channels 12a, 12b have substantially the same wavelength,
the signal levels of the two carrier signals may add. Thus, the
signal propagating in channel 12b, in turn, may exceed the
predetermined threshold signal level.

The crosstalk signal 15 and the carrier signal propagating
on channel 12b are coupled to the second switch 18 which
is provided having first and second channels 12b, 12c.
Switch 18, like switch 10, has a finite amount of isolation
between the first and second switch channels 12b, 12c.
Channels 12b, 12c both propagate through the same node
17b, which in this particular example corresponds to the
switch 18. Furthermore, signals propagating on the channels
12b, 12c have substantially the same carrier signal wavelength.
Owing to the finite isolation characteristic of the
switch 18, a portion of the signal propagating along the
channel 12b can be coupled to the second switch channel
12c through a crosstalk channel 20. Thus, the crosstalk
signal 15 propagates from the first switch channel 12b
through the crosstalk channel 20 to the second switch
channel 12c resulting in a second crosstalk signal 21
propagating on the channel 12c.

Since the carrier signals propagating in channels 12a, 12b
and 12c each have substantially the same wavelength, if the
amplitude of the crosstalk signal is sufficiently large,
disruption of the signals propagating on the channel 12c can
occur.

In this case both nodes 17a, 17b may correctly recognize
the failure as a crosstalk jamming attack. Node 17a will
correctly ascertain that the offending channel is channel 12a
but node 17b will ascertain the offending channel as channel
12b. If the network has no means of localizing the source of
the attack, then node 17a will disconnect channel 12a and
node 17b will disconnect channel 12b. Channel 12b will,
therefore, have been erroneously disconnected. Thus, to
allow the network to properly recover from attacks, it is
necessary to ascertain attacks carried out by network traffic
and to localize the source of these attacks.

In networks having relatively high data transmission
rates, ultrafast restoration is typically preplanned and based
upon local information (i.e. information local to a network
node). The restoration route is generally stored in a memory
device within the network nodes. This approach avoids the
delays associated with dynamically computing routes once a
failure occurs. To utilize such a pre-planned or pre-stored
approach, it is thus necessary to store the alternate route
information at each of the network nodes.

As explained above in conjunction with FIG. 2, the
techniques for responding to signal transmission problems
due to a failure which occurs because of natural fatigue of
components or physical sabotage of the network are not well
suited to responding to signal transmission problems caused
by the signals themselves. For example, one technique for
recovering from a node failure (i.e. a failure due to natural
fatigue of components or physical sabotage of the network)
is to reroute traffic away from the failed node. This technique
is used in synchronous optical networks (SONET) and
synchronous digital hierarchy (SDH) bidirectional
self-healing rings (SHRs). In a SONET/SDH bidirectional SHR,
if the traffic itself is the cause of the failure, as is the case in
the amplifier and switch attacks discussed above, then
failures may be caused throughout the network without any
restoration.

Another technique for recovering from a failure is to
localize component failures. Once the failed components are
localized, they can be physically removed from the network
and repaired or replaced with other components. One problem
with this technique, however, is that it results in service
degradation or denial while the failed component or components
are being identified and repaired or replaced.
Another problem with this technique is that it may take a
relatively long period of time before the failed component or
components can be identified and repaired or replaced.
Furthermore, since each failed component must be physically
located and repaired or replaced, further time delays
can be incurred.

Thus, if techniques intended to respond to naturally
occurring failures are applied to cases of service disruption
attacks in AONs, an attack at a single point can lead to
widespread failures within the network. It is, therefore,
important to be able to ascertain whether an attack is caused
by traffic itself or from a failure which occurs because of
natural fatigue of components or physical sabotage of the
network.

For example, assume there is an attack on a node i, which
carries channels 1, 2 and 3, from channel 1. If a network
management system deals with all failures as though they
were benign failures (e.g. a failure due to component
fatigue), then the network management system assumes that
node i failed of its own accord and reroutes the three
channels to some other node, say node j. After that rerouting,
node j will appear as having failed because channel 1 will
attack node j. The network may then reroute all three
channels to node k, and so on. Therefore, it is important for
node i under attack to be able to recognize an attack coming
from its traffic stream and to differentiate it from a physical
hardware failure which is not due to the traffic streams
traversing node i.

Attacks such as the amplifier and switch attacks discussed
above can lead to service denial. The ability to use attacks
to deny service stems from the fact that attacks can spread,
causing malfunctions at several locations, whereas failures
generally do not disrupt the operation of several devices.
Thus, while a single network element failure may cause
several network elements to have corrupted inputs and
outputs, the failure will not generally cause other network
elements to be defective in their operation.

SUMMARY OF THE INVENTION

In view of the above, it has been recognized that since the
results of component failures and attacks are often similar
(e.g. improper operation of one or more network components
or nodes), the difference is transparent to a network
node or system user. Because of this transparency there is no
absolute metric to determine whether an input is faulty or
not. Instead, it is necessary to examine the operation of a
node, i.e., the relation between the input and the output. A
failure will lead to incorrect operation of the node. An attack,
as illustrated above in conjunction with FIGS. 1 and 2, can
cause network elements not only to have corrupted inputs
and outputs, but the nature of those corrupted inputs can lead
to improper operation of the network elements themselves.
Hence, if alarms are raised at individual network elements
by improper operation of the network element, a fault will
lead to a single alarm. An attack, on the other hand, may lead
to alarms in several nodes downstream (in the flow of
communications) of the first node or network point which is
attacked. Thus, if a restoration scheme is prepared to recover
from failures but encounters instead an attack, the restoration
scheme itself may malfunction and cause failures.

FIGS. 3, 3A illustrate SONET/SDH approaches to recovery
schemes. These recovery schemes are based on rings.
SONET/SDH allow for network restoration after failure
using two techniques illustrated respectively in FIGS. 3 and
3A.

Referring now to FIG. 3, a ring 24 having network nodes
24a-24e utilizes a recovery technique typically referred to
as automatic protection switching (APS). The APS technique
utilizes two streams 26a, 26b which traverse physically
node or link disjoint paths between a source and a
destination. In this particular example, stream 26a couples a
source node 24a to a destination node 24d with information
flowing in a clockwise direction through intermediate nodes
24b, 24c. Stream 26b, on the other hand, couples the source
node 24a to destination node 24d with information flowing
in a counterclockwise direction through intermediate node
24e. In case of failure of a node or link along one of the
streams, e.g. stream 26a, the receiving node listens to the
redundant, backup stream, e.g. stream 26b. Such a technique
is used in the SONET unidirectional path switched ring
(UPSR) systems.

Referring now to FIG. 3A, a ring 28 having network nodes
28a-28e utilizes a recovery technique typically referred to
as loopback protection. In the loopback approach, in case of
a failure, a single stream 29a is rerouted onto a backup
channel 29b. Such an approach is used in the SONET
bidirectional line switched ring (BLSR).

For any node or edge redundant graph, there exists a pair
of node or edge-disjoint paths, that can be used for APS,
between any two nodes. Automatic protection switching
over arbitrary redundant networks need not restrict itself to
two paths between every pair of nodes, but can instead be
performed with trees, which are more bandwidth efficient for
multicast traffic. For loopback protection, most of the
schemes have relied on interconnection of rings or on
finding ring covers in networks. Loopback can also be
performed on arbitrary redundant networks.

In FIGS. 4 and 4A, in which like elements are provided
having like reference designations, the manner in which a
single attack may lead to service disruption in the case of
loopback recovery is shown.

Referring briefly to FIG. 4, a portion of a network 30
includes network nodes j, k. For purposes of illustration,
assume node j is the attack source (i.e. node j is attacked, for
instance by a nefarious user using node j as a point of entry
into the network for insertion of a spurious jamming signal).
The jamming signal causes the nodes adjacent to node j to
infer that node j has failed, or is "down." The same jamming
signal, upon traveling to node k, will cause the nodes
adjacent to node k to infer that node k has failed. If both
nodes j and k are considered as individual failures by a
network management system, then loopback will be performed
to bypass both nodes j and k in a ring. Thus, all traffic
which passed through both nodes j and k will be disrupted,
as indicated by path 31 in FIG. 4 by the loopback at each of
the nodes j, k.

Referring now to FIG. 4A, if node j is correctly localized
as the source of the attack, then loopback effected to bypass
node j will lead to correct operation of the network, with
only the inevitable loss of traffic which had node j as its
destination or origination. Traffic which traversed node j
from node i is backhauled through node j. Thus, by correctly
localizing the source of an attack, the amount of traffic which
is lost can be reduced.

Briefly, and in general overview, work in the area of fault
localization in current data networks can be summarized and
categorized as three different sets of fault diagnosis frameworks:
(1) fault diagnosis for computing networks; (2)
probabilistic fault diagnosis by alarm correlation; and (3)
fault diagnosis methods specific to AONs.

The fault diagnosis framework for computing networks
covers those cases in which units communicate with subsets
of other units for testing. In this approach, each unit is
permanently either faulty or operational. The test of a unit
to determine whether it is faulty or operational is reliable
only for operational units. Necessary and sufficient conditions
for the testing structure for establishing each unit as
faulty or operational as long as the total number of faulty
elements is under some bound are known in the art.
Polynomial-time algorithms for identifying faults in diagnosable
systems have been used. Instead of being able to
determine exactly the faulty units, another approach has
been to determine the most likely fault set.

All of the above techniques have several drawbacks. First,
they require each unit to be fixed as either faulty or operational.
Hence, sporadic attacks which may only temporarily
disable a unit cannot be handled by the above approaches.
Thus, the techniques are not robust. Second, the techniques
require tests to be carefully designed and sequentially
applied. Moreover, the number of tests required rises with
the possible number of faults. Thus, it is relatively difficult
to scale the techniques. Third, the tests do not establish any
type of causality among failures and thus the tests cannot
establish the source of an attack by observing other attacks.
The techniques, therefore, do not allow network nodes to
operate with only local information. Fourth, fault diagnosis
by many successive test experiments may not be rapid
enough to perform automatic recovery.

The probabilistic fault diagnosis approaches for performing
fault localization in networks typically utilize a Bayesian
analysis of alarms in networks. In this approach, alarms
from different network nodes are collected centrally and
analyzed to determine the most probable failure scenario.
Unlike the fault diagnosis for computing networks
techniques, the Bayesian analysis techniques can be used to
discover the source(s) of attacks thus enabling automatic
recovery. Moreover, the Bayesian analysis techniques can
analyze a wide range of time-varying attacks and thus these
techniques are relatively robust. All of the above results,
however, assume some degree of centralized processing of
alarms, usually at the network and subnetwork level. Thus,
one problem with this technique is that an increase in the
size of the network leads to a concomitant increase in the
time and complexity of the processing required to perform
fault localization.

Another problem with the Bayesian analysis techniques is
that there are delays involved with propagation of the
messages to the processing locations. In networks having a
relatively small number of processing locations, the delays
are relatively small. In networks having a relatively large
number of processing locations, however, the delays may be
relatively long and thus the Bayesian analysis techniques
may be relatively slow. Thus the Bayesian analysis techniques
may not scale well as network data rates increase or
as the size of the network increases. If either the data rate or
the span of the network increases, there is a growth in the latency
of the network, i.e. the number of bits in flight in the
network. The combined increase in processing delay and in
latency implies that many bits may be beyond the reach of
corrective measures by the time attacks are detected.
Therefore, an increase in network span and data rate would
lead to an exacerbation of the problem of insufficiently rapid
detection.

For AONs, fault diagnosis and related network management
issues have been considered. Some of the management
issues for other high-speed electro-optic networks are also
applicable. The problem of spreading of fault alarms, which
exists for several types of communication networks, is
exacerbated in AONs by the fact that signals flow through
AONs without being processed. To address faults only due
to fiber failure, only the nodes adjacent to the failed fiber
need to find out about the failure and a node need only
switch from one fiber to another. For failures which occur in
a chain of in-line repeaters which do not have the capability
to switch from one fiber to another, one approach is that, when a
failure occurs, the alarm due to the failure is generated by the
in-line repeater immediately after the link failure. The
failure alarm then travels down to a node which can perform
failure diagnostic. The failure alarms generated downstream
of the first failure are masked by using upstream precedence.
Failure localization can then be accomplished by having the
node capable of diagnostics send messages over a supervisory
channel towards the source of the failure until the
failure is localized and an alarm is generated at the first
repeater after a failure. These techniques require diagnostic
operations to be performed by remote nodes and to have
two-way communications between nodes.

It would, therefore, be desirable to provide a technique for
stopping an attack on a signal channel by a nefarious user
which does not result in service degradation or denial. It
would also be desirable to provide a technique for localizing
an attack on a network. It would further be desirable to
provide a relatively robust, scalable technique which localizes
rapidly the source of an attack in a network and allows
rapid, automatic recovery in the network.

In accordance with the present invention, a distributed
method for performing attack localization in a network
having a plurality of nodes includes the steps of (a)
determining, at each of the plurality of nodes in the network,
if there is an attack on the node; (b) transmitting one or more
messages using local communication between first and
second nodes wherein a first one of the nodes is upstream
from a second one of the nodes and wherein each of the one
or more messages indicates that the node transmitting the
message detected an attack at the message transmitting
node; and (c) processing messages received in a message
processing one of the first and second nodes to determine if
the message processing node is the first node to [illegible]
on a certain channel. With this particular arrangement, a
technique for finding the [illegible] of an attacking signal is
provided. By processing node status information at each
node in the network, generating responses based on the
node status information and the messages received by the
node, the technique can be used to determine whether an
attack is caused by network traffic or by failure of a network
element or component. In this manner, an attack on the
network can be localized. By localizing the attack, the
network maintains quality of service. Furthermore, while the
technique of the present invention is particularly useful for
localization of propagating attacks, the technique will also
localize component failures which can be viewed as
non-propagating attacks. The technique can be applied to perform
loopback restoration as well as automatic protection
switching (APS). Thus, the technique provides a means for
utilizing attack localization with a loopback recovery technique
or an APS technique to avoid unnecessary service
denial. The nodes include a response processor which processes
incoming messages and local node status information
to determine the response of the node. The particular
response of each node depends upon a variety of factors
including but not limited to the particular type of network,
the particular type of recovery scheme (e.g. loopback or
automatic protection switching), the particular type of network
application and the particular goal (e.g. raise an alarm,
reroute the node immediately before and/or after the
attacked node in the network, etc.).

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing features of the invention, as well as the
invention itself may be more fully understood from the
following detailed description of the drawings, in which:

FIG. 1 is a block diagram illustrating a network attack
implemented through a switch and an amplifier;

FIG. 2 is a block diagram illustrating a network attack
implemented through a pair of switches;

FIG. 3 is a block diagram illustrating the results of an
automatic protection switching technique in a ring network;

FIG. 3A is a block diagram illustrating the results of a
loopback protection technique in a ring network;

FIG. 4 is a block diagram illustrating a first possible result
of loopback recovery when a pair of nodes detect an attack
and both nodes are believed to be faulty;

FIG. 4A is a block diagram illustrating a second
possible result of loopback recovery when a pair of nodes
detect an attack and the attack source is localized;

FIG. 5 is a block diagram of a network;

FIG. 6 is a flow diagram illustrating the processing steps
performed by nodes in a network to perform attack localization;

FIG. 7 is a flow diagram illustrating the processing steps
performed by nodes in a network to ascertain a fault type and
transmit the fault type to adjacent nodes in the network;

FIG. 7A is a flow diagram illustrating the processing steps
performed to determine whether a node is the source of an
attack or whether the source of the attack is an upstream
node;
FIG. 8 is a block diagram illustrating a propagating attack
which does not disrupt all nodes in a channel;

FIG. 9 is a flow diagram illustrating processing steps
performed by nodes in a network to determine whether the
node is the source of an attack or whether the attack is being
carried by data from a node upstream;

FIG. 10 is a flow diagram illustrating processing steps
performed by nodes to utilize transmission of messages to
nodes which are upstream in a network;

FIG. 11 is a flow diagram illustrating processing steps
performed by nodes to determine whether a response should
be an alarm response or an alert response;

FIG. 12 is a block diagram illustrating automatic protection
switching performed in accordance with the techniques
of the present invention;

FIGS. 13, 13A are a series of flow diagrams illustrating
processing steps performed by nodes to implement loopback
recovery in accordance with the present invention; and

FIG. 14 is a block diagram illustrating loopback protection
performed in accordance with the techniques of the
present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Terminology

Before describing the apparatus and processes for performing
fault isolation in communication networks, some
introductory concepts and terminology are explained. The
term "network" as used herein refers to a collection of
assets, switching apparatus and conduits which permit the
transmission of resources. Thus, the networks may be used
for communications systems, data transmission systems,
information systems or power systems. In one embodiment,
the network may be provided as an internet. The resources
may be provided as optical signals such as power signals,
information signals, etc.

The term "node" refers to a component or group of
components or a processing element or a combination or
group of processing elements in a network. A source node
refers to a point of origin of a message and a destination
node refers to an intended point of receipt of a message. A
first node in a series of nodes affected by an attack is referred
to as the "source of the attack", or an "attack source", even
though the attack may have been launched at a different
point in the network.

A "channel" refers to a combination of transmission
media and equipment capable of receiving signals at one
point and delivering the same or related signals to another
point. A "message" refers to information exchanged
between nodes in a network. Thus messages are transmitted
between nodes on channels.

The term "failure" refers to any malfunction which (a)
affects the input-to-output relationship at a node of a network;
or (b) leads to imperfect or incorrect operation of
a network, network transmission media or a network node
(e.g. a malfunction or degradation of a node or a link due to
natural fatigue of components or physical sabotage of the
network).

The term "attack" refers to a process which causes a
failure at one or more links or nodes. An attack is a process
which affects signal channels having signal paths or routes
which share devices with a nefarious user's channel.

The term "fault" refers to a component failure of a
network element. Typically, the term "fault" is used to
describe a hardware failure in a network and in particular a
hardware failure of a network element.

The term "attack localization" or more simply "localization"
refers to the process by which the source of an attack
in the network is isolated. The same process can also
pinpoint other nodes in the network which may experience
a failure due to an attack but which are not the source of an
attack.

It should be noted that the techniques of the present
invention have applicability to a wide variety of different
types of networks and are advantageously used in those
applications which provide relatively high-speed optical
communications. For example the techniques may be used
in SONET and SDH networks which each include network
restoration protocols. It should be noted that although
SONET/SDH are not all-optical standards, the rates supported
by these standards make their need for rapid restoration
commensurate with that of AONs. Thus the techniques
described herein find applicability in any network having a
need for rapid service restoration.

Referring now to FIG. 5, a network 40 includes a plurality
of nodes N1-N6, generally denoted 42. Each of the nodes 42
processes a predetermined number of communication channels
(e.g., channel 48a) coupled thereto via respective ones
of communication links 48. Each of the nodes 42 includes a
response processor 43 which processes incoming messages
to the node (InMessages) and local node status information
to determine the response of the node 42 which receives the
incoming messages. Each of the channels terminates or
originates at nodes 42 and each channel has a specific
direction (i.e. signals are transmitted in a particular direction
in each communication channel). Thus, with respect to a
particular channel, nodes can be referred to as being
upstream or downstream of one another.

For example, in one communication channel the node N1
is upstream of the node N2 and the node N1 is downstream
of the node N6. In another communication channel,
however, the node N1 may be downstream of the node N2
and the node N1 may be upstream of the node N6. Each node
42 is able to detect and recognize attacks being levied
against it, receive and process messages arriving to it and
generate and transmit messages to nodes which are upstream
or downstream of it on certain channels.

It should be noted that for the purposes of the present
invention, a node may correspond to a single network
component. Alternatively a single network component may
be represented as more than one node. For example, a switch
may be represented as several nodes, one node for each
switching plane of the switch. Likewise, in some applications
it may be advantageous to represent a multichannel
amplifier as a single node while in other applications it may
be advantageous to represent the multichannel amplifier as
multiple nodes. Alternatively still, a cascade of in-line
amplifiers may be modeled as a single node because they
have a single input and a single output.

After reading the techniques described herein, those of
ordinary skill in the art will appreciate how to advantageously
represent particular network components and when
to represent multiple components as a single node and when
to represent a single component as a network node. In
making such a determination, a variety of factors are considered
including but not limited to the ability of a node or
network element or component to detect a failure (e.g. it may
be preferable to not represent an element as a node if the
element can't detect a failure) and the importance in any
particular application of having the ability to specifically
localize a network element (e.g. in some applications it may
be desirable to localize an attack to a node which includes
many elements while in other applications it may be desirable
or required to localize an attack to a particular element
within a node). This depends, at least in part, on where the
processing capability exists within a network. Depending
upon the particular application, other factors may also be
considered.

Each of the nodes 42 has one or more inputs I_ij and
outputs O_ij with corresponding directed connections denoted
as (i, j) when the connection is made from node i to node j
by a link. An undirected connection between nodes i and j
is denoted herein as [i, j]. In FIG. 5, node inputs and outputs
are designated 49 and for simplicity and ease of description
each of the nodes 42 includes a single input and a single
output generally denoted 48. The notation T_12 indicates the
time required to transmit a message on a channel between
nodes 1 and 2 on which channel information flows in a
direction from node 1 to node 2. Those of ordinary skill in
the art will appreciate, of course, that in practical networks
many of the nodes will have multiple inputs and outputs.

Network 40 and the networks referred to and described
herein below are assumed to be acyclic (i.e. the particular
communication path along which the information is transmitted
contains no cycles).

In general overview, the network 40 operates in accordance
with the present invention in the following manner. A
distributed processing occurs in the nodes 42 to provide a
technique which can rapidly ascertain which one or ones of the
nodes 42 are sources of an attack. It should be noted that the
nodes 42 include some processing capability including
means for detection of failures. The ability to provide the
nodes with such processing capability is within the skill of
one of ordinary skill in the art. Thus, the nodes 42 can detect
failures with satisfactory false positive and false negative
probabilities. The ability to localize attacks in the network is
provided in combination by the distributed processing which
takes place in the network.

The techniques of the present invention for attack localization
are, therefore, distributed and use local communication
between nodes up- and down-stream. Each node 42 in
the network 40 determines if it detects an attack. It then
processes messages from neighboring nodes 42 to determine
if the attack was passed to it or if it is the first node to sustain
an attack on a certain channel. The first node affected by an
attack is referred to as the source of the attack, even though
the attack may have been launched elsewhere. The global
success of localizing the attack depends upon correct message
passing and processing at the local nodes.

In describing the processing which takes place at particular
nodes, it is useful to define some terms related to the timing
of such processing. Time delays for processing and transmission
of messages at each of the nodes 42 are denoted as
follows:

T_i^meas = measurement time for node i including time to
format and send messages (where the measurement time is
the time required to detect an attack);
T_i^proc = processing time for node i including time to format
and send messages (where the processing time is the time
required to process received messages); and
T_ij = time to transmit a message from node i to node j on arc
(i, j).

In some instances described herein below, the time delays
at all nodes are identical and thus the measurement and
processing times are denoted as T^meas and T^proc without
subscripts.

One concept included in the present invention is the
recognition that, in order for a node to determine whether or
not it is the source of an attack, it need only know whether
a node upstream of it also had the same type of attack. For
example, suppose that node 1 is upstream of node 2 on a
certain channel 48a which is ascertained as being an attacking
channel and that both node 1 and node 2 ascertain that
the attacking channel is channel 48a. Suppose further that
both nodes 1 and 2 have processing times T^meas and T^proc.

If node 1 transmits to node 2 its finding that the channel 48a
is nefarious, then the interval between the time when the
attack hits node 2 and the time when node 2 receives notice
from node 1 that the attack also hit node 1 is at most T^meas,
since the attack and the message concerning the attack travel
together. Moreover, the detection of the attack commences at
node 2 as soon as the attack hits. Hence, the elapsed time from
when the attack hits node 2 until node 2 detects the attack and
determines whether node 1 also saw that attack is
T^meas + T^proc. It should be noted that this time is
independent of the delay in the communications between nodes
1 and 2 because the attack and the message concerning the
attack travel together. If the attack hits several nodes, each
node only waits T^meas + T^proc to determine whether or not
it is the first node to detect that attack, i.e. whether it is
the source of the attack.

To illustrate the technique, it is useful to consider a
relatively simple attack localization problem. In this network
nodes can either have a status of 1 (O.K.) or 0 (alarm).
Nodes monitor messages received from nodes upstream. Let
the message be the status of the node. When an attack occurs
in this network, the goal of the techniques set forth in
accordance with the present invention is that the node under
attack respond with an alarm and all other nodes respond
with O.K.

During the processing, once an attack is detected at a
node, node 2 in network 40 for example, node 2 initiates
processing to ascertain the source of the attack by transmitting
its own node status to other nodes and receiving the
status of other nodes via messages transmitted to node 2
from the other nodes. It should be noted that the nodes from
which node 2 receives messages may be either upstream or
downstream nodes. In response to each of the messages
received by node 2 which meets a predetermined criterion (e.g.
the messages are received at node 2 within a predetermined
period of time such as [t - T^wait1, t + T^wait2]), node 2
transmits response messages which provide information related
to the identity of the attack source. It should be noted that in
some embodiments the response can be to ignore messages.
Similarly, each of the nodes 42 in network 40 receives
messages and in response to particular ones of the messages,
the nodes provide information related to the identity of the
source of the attack. The particular response messages will
vary in accordance with a variety of factors including but not
limited to the particular network application and side effects
such as loopback, re-routing and disabling of messages.

In performing such processing, each of the nodes 42
receives and stores information related to the other nodes in
the network 40. Thus, the processing to localize the attack
source is distributed throughout the network 40.

With the above distributed approach, if node 2 is downstream
from node 1 and node 2 detects a crosstalk jamming
attack on the first channel and node 2 has information
indicating that the node 1 also had a crosstalk jamming
attack on a second different channel, node 2 can allow node
1 to disconnect the channel subject to the attack. Once node
1 disconnects the channel subject to the attack, the channel
subject to the attack at node 2 ceases to appear as an
offending channel at node 2.
If node 2 did not have information from node 1 indicating
that the channel at node 1 was subject to attack at node 1
then node 2 infers that the attacker is attacking node 2 on the
channel on which it detected the attack. Node 2 then
disconnects the channel. It should be appreciated that node
2 sees no difference between the cases where channel 1 is the 
attacker at node 1 and where channel 2 is the attacker at node 
2. In both cases, channel 2 appears as the attacker at node 2. 
Thus, by using knowledge from the operation of node 1 
upstream of node 2, node 2 can deduce whether the attack 
originated with channel 1 or channel 2 thereby avoiding the 
result of erroneously disconnecting a channel which is not 
the source of an attack. Thus, the technique of the present 
invention allows the network to recover properly from 
attacks by identifying attacks carried out by network traffic 
and localizing those attacks. 
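
The decision node 2 makes in the crosstalk example above can be sketched in a few lines. This is only an illustrative sketch, not the claimed implementation: the function name is_attack_source and its parameters are hypothetical, and "no information from node 1" is modeled as None.

```python
from typing import Optional


def is_attack_source(local_attack_detected: bool,
                     upstream_reported_attack: Optional[bool]) -> bool:
    """Decide whether this node should treat itself as the attack source.

    upstream_reported_attack is None when no message arrived from the
    upstream node (an assumption used here to model missing information).
    """
    if not local_attack_detected:
        # A node that sees no attack is never the source.
        return False
    # If the upstream node also saw the attack, the attack was carried in
    # by the data, so the upstream node is left to disconnect the channel.
    return not bool(upstream_reported_attack)


# Node 2 detects an attack and node 1 reported one too: node 2 defers.
print(is_attack_source(True, True))
# Node 2 detects an attack and has nothing from node 1: node 2 disconnects.
print(is_attack_source(True, None))
```

Under these assumptions node 2 disconnects its own channel only when it has detected an attack and has no upstream report of one, which avoids disconnecting a channel that is not the source.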

As mentioned above, each of the nodes 42 can detect an 
attack or fault within acceptable error levels. The type of 
faults detected are included in a set of fault types denoted F 
stored within a node storage device. One of the fault types 
in F is always a status corresponding to a no fault status 
meaning that the node has not detected a fault. 

The status of a node at a time t is denoted:

S_t(i) ∈ F

in which:

S(i) is the current status of node i;
t is the time to which the current node status applies; and
F is the set of all faults to which the status must belong
(i.e. the current node status must be a status included in
the set of faults F).
Considering a connection between the nodes i and j along
the arc (i, j), a message from node i to node j at time t is
denoted M_t(i, j). Messages can be sent upstream or downstream
in the network 40. The upstream message from node
j to node i at time t is denoted M_t(j, i). For particular network
applications the information encoded in messages varies but
typically includes the node status information. Generally,
however, the message can include any information useful for
processing. It is, however, generally preferred that messages
remain relatively small for fast transmission and processing.
That is, for each application there is defined a particular
maximum message length. The particular message length in
any application is selected in accordance with a variety of
factors including but not limited to the type of encoding in
the message. Moreover, the number and lengths of messages
should be independent of network size. This allows the system
to be scalable with respect to distance and/or number of nodes.
If large messages based on network size are utilized, this
results in loss of the scalability characteristic of the invention
because of the long processing times which would result.
For example, a node can transmit its status upstream and
downstream via the messages M_t(i, j) = S_t(i) and
M_t(j, i) = S_t(j). Message M_t(i, j) arrives at node j at time
t + T_ij and likewise message M_t(j, i) arrives at node i at
time t + T_ij. Again, the notation M_t(i, j) and M_t(j, i)
indicates the current message from node i to node j and from
node j to node i, respectively.
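
As a rough illustration of the message notation, the sketch below models a status message and its arrival time. The Message class, its field names and the helper status_messages are hypothetical and are not taken from the specification; the status values 1 (no fault) and 0 (fault) follow the simple two-status example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Message:
    """A small, fixed-size status message (hypothetical layout)."""
    sender: int
    receiver: int
    sent_at: int   # time t at which the message is sent
    status: int    # an element of the fault set F

    def arrival_time(self, t_link: int) -> int:
        # A message sent at time t arrives at time t + T_ij.
        return self.sent_at + t_link


def status_messages(i: int, j: int, t: int,
                    s_i: int, s_j: int) -> Tuple[Message, Message]:
    """Build the downstream message (i to j) and the upstream message (j to i)."""
    downstream = Message(sender=i, receiver=j, sent_at=t, status=s_i)
    upstream = Message(sender=j, receiver=i, sent_at=t, status=s_j)
    return downstream, upstream


down, up = status_messages(i=1, j=2, t=10, s_i=0, s_j=1)
print(down.arrival_time(t_link=3), up.arrival_time(t_link=3))
```

Each message carries only a sender, a receiver, a timestamp and one status value, so its size does not grow with the number of nodes in the network.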

A response function, R denotes processing of incoming 
messages and local status information to determine the 
response of the node which received the incoming message. 
The response function R will be discussed further below in 
the context of particular techniques implemented in accor- 
dance with the present invention. 
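
A minimal sketch of one possible response function for the simple two-status problem (1 for O.K., 0 for alarm) is given below. It is an assumption-laden illustration only: the names response and localize are hypothetical, the network is reduced to a single chain of nodes, and each node is assumed to hear only from the node immediately upstream within its waiting interval.

```python
from typing import List

OK, ALARM = 1, 0  # two-element fault set F of the simple example


def response(status: int, in_messages: List[int]) -> int:
    """Hypothetical response function for the two-status problem.

    status      -- this node's own measured status (OK or ALARM)
    in_messages -- statuses received from upstream within the wait window
    Returns ALARM only if this node is the first node hit by the attack.
    """
    if status == ALARM and ALARM not in in_messages:
        return ALARM
    return OK


def localize(measured: List[int]) -> List[int]:
    """Apply the response function along an acyclic chain of nodes.

    Node k receives the measured status of node k-1 as its only message.
    """
    responses = []
    for k, status in enumerate(measured):
        upstream = [measured[k - 1]] if k > 0 else []
        responses.append(response(status, upstream))
    return responses


# An attack enters at node index 2 and propagates to all downstream nodes.
print(localize([OK, OK, ALARM, ALARM, ALARM]))  # prints [1, 1, 0, 1, 1]
```

The function uses only a comparison and a membership check on a short list, so its cost does not depend on the size of the network.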
In accordance with the invention, it has been recognized
that it is necessary to explicitly take into account the time
taken by the different processes involved in the identification
and localization of attacks. The identification of an attack
requires time for detection of the input and output signals
and processing of the results of that detection. There is also
delay involved in generating messages to upstream and/or
downstream nodes. All the time required by all of the above
processes executed in sequence is referred to as the measurement
time at the node. Thus, the measurement time at node i is
denoted as T_i^meas. Messages from node i to node j take at
most time T_ij to transmit. Message transmission follows the
transmission of the data itself, and does not usually add to
the overall time of localizing the attack. Lastly, there are
delays due to the time for capturing messages from upstream
and/or downstream nodes, the time to process these messages
together with local information and the time to generate
new messages. We denote the time required by this last
set of events as T_i^proc.
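
The following sketch works through the timing for two adjacent nodes i and j. It assumes, as a simplification, that the attack and the status message experience the same link delay T_ij and that node i sends its message once its measurement time has elapsed; the function names and the integer time units are hypothetical.

```python
def decision_time(t0: int, t_ij: int, t_meas_i: int,
                  t_meas_j: int, t_proc_j: int) -> int:
    """Time at which downstream node j finishes its decision.

    t0 is the time at which the attack hits upstream node i.
    """
    attack_at_j = t0 + t_ij                 # attack reaches node j
    own_detection = attack_at_j + t_meas_j  # node j finishes measuring
    upstream_msg = t0 + t_meas_i + t_ij     # node i's message reaches node j
    # Node j can start processing once it has both pieces of information.
    return max(own_detection, upstream_msg) + t_proc_j


def elapsed_at_j(t0: int, t_ij: int, t_meas_i: int,
                 t_meas_j: int, t_proc_j: int) -> int:
    """Delay between the attack hitting node j and node j's decision."""
    return decision_time(t0, t_ij, t_meas_i, t_meas_j, t_proc_j) - (t0 + t_ij)


# Same node delays, very different link delays: the elapsed time is unchanged.
print(elapsed_at_j(0, 5, 2, 3, 1), elapsed_at_j(0, 5000, 2, 3, 1))
```

Under these assumptions the elapsed time reduces to max(T_i^meas, T_j^meas) + T_j^proc, in which the link delay T_ij does not appear.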

Thus, in accordance with the present invention, a network
or network management system provides techniques for: (a)
localization of the source of an attack to enable automatic
recovery; (b) relatively fast operation (implying near constant
operational complexity); (c) scalability, in that the delay
must not increase with the size and span of the network; and
(d) robustness, meaning valid operation under any attack
scenario including sporadic attacks.

FIGS. 6-9, 10, 11, 13 and 13A are a series of flow
diagrams which illustrate various aspects of the processing
performed by various portions of network 40 to provide a
communications network which utilizes a distributed technique
for performing fault isolation. The rectangular elements
(typified by element 50 in FIG. 6), herein denoted
"processing blocks," represent computer software instructions
or groups of instructions. The diamond shaped elements
(typified by element 54 in FIG. 6), herein denoted
"decision blocks," represent computer software instructions,
or groups of instructions which affect the execution of the
computer software instructions represented by the processing
blocks. The flow diagrams do not depict syntax of any
particular programming language.

Alternatively, the processing and decision blocks represent
steps performed by functionally equivalent circuits such
as a digital signal processor circuit or an application specific
integrated circuit (ASIC). The flow diagrams do not depict
the syntax of any particular programming or design language.
Rather, the flow diagrams illustrate the functional
information one of ordinary skill in the art requires to
fabricate circuits or to generate computer software to perform
the processing required of the particular apparatus. It
should be noted that many routine program elements, such
as initialization of loops and variables and the use of
temporary variables are not shown.

It will be appreciated by those of ordinary skill in the art
that unless otherwise indicated herein, the particular
sequence of steps described is illustrative only and can be
varied without departing from the spirit of the invention.
That is, unless otherwise noted or obvious from the context,
it is not necessary to perform particular steps in the particular
order in which they are presented hereinbelow.

Turning now to FIG. 6, each node in a network (such as
nodes 42 in network 40 described above in conjunction with
FIG. 5) repeats the following process steps and responds
accordingly depending upon the node status and the status
condition indicated in the messages received by the nodes.
Processing begins with Step 50 in which the status of a
node N1 is computed at a time t. Processing then proceeds
to Step 52 where the node transmits a message including the
node status information to nodes downstream. As shown in
decision block 54 if the node status is not an alarm status,
then processing ends.

If in decision block 54 a decision is made that the status is
an alarm status then processing flows to decision block 56
where the node determines if any alarm messages have
arrived at the node in a pre-determined time interval. In one
particular embodiment the predetermined time interval
corresponds to the period of time between when the node status
of node N1 is computed, denoted as t, and the measurement
time after time t, denoted as t + T^meas. The predetermined
period of time thus corresponds to T^meas.

If the node N1 has not received any alarm messages
in the pre-determined time interval, then processing flows to
Step 58 where the node's status is set as alarm (i.e. s=0) and
processing ends. If, on the other hand, the node received an
alarm message within the predetermined time interval, then
processing flows to processing block 60 where the node
status is set as okay (e.g. s=1). Processing then ends.

From the above processing steps it can be seen that no
node will generate an alarm until at least one attack is
detected. When an attack occurs only the first node experiencing
the attack will respond with an alarm. All nodes
downstream from the first node receive messages which
indicate that the node upstream experienced an attack. Thus,
nodes downstream from the attack will respond with O.K.
This network response achieves the goal of attack localization.

Referring now to FIG. 7, a general processing technique
to ascertain fault types at one node and transmit the fault
types to adjacent nodes in a network begins in processing
block 62 where a node i computes a node status S(i) at a time
t. While computing the node status any faults in the node can
be ascertained and reflected in the status. Processing then
flows to Step 64 where a response function R to be included
in a message M is computed. The response function R is
computed from the node status S(i). The node response is
determined by the response function R which processes the
node status information S(i) without regard to incoming
messages.

Processing then flows to processing block 66 where
messages M which include the response function R are
transmitted on arcs leaving the node. In a preferred
embodiment, the messages are transmitted on all arcs leaving
the node. Processing then flows to processing Step 68
where the node collects messages arriving at the node within
a pre-determined time interval. In a preferred embodiment,
the predetermined time interval corresponds to
[t - T^wait1, t + T^wait2]. The wait times T^wait1, T^wait2 are
selected to result in each node having equal final processing
times which can be equal to the maximum time required by any
of the response functions.

Processing then flows to Step 70 where responses for
inclusion in messages M are computed in accordance with a
pre-determined response function R selected in accordance
with the node status and the messages received within the
predetermined time interval. Additional action can be taken
by the node, such as switching direction of communication.
Resulting messages are then transmitted on arcs leaving the
node as shown in Step 72. In a preferred embodiment the
messages are transmitted on all arcs which leave the node.
Processing then ends.

The general processing technique, like the simple
example of attack localization discussed above in conjunction
with FIG. 6, is a distributed algorithm that achieves its
goal through local processing and message passing. The goal
of the algorithm can vary for different network examples.
For example, the goal may be to raise an alarm as in the
process of FIG. 6. A more complex goal may be to reroute
the node immediately before and after the attacked node in
the network. The techniques of the present invention are
general enough to be suitable for a wide range of network
goals. It should be noted that the particular processing steps
performed in the nodes (such as the set of faults, the format
of the messages and the node response to input messages)
are defined for the particular network application and that
one of ordinary skill in the art will appreciate how to provide
nodes having the necessary capabilities in a particular
application.

The above technique thus ascertains the fault type and
transmits it to adjacent nodes in the network. It then monitors
incoming messages for a specified (bounded) time
interval and responds to these messages. The response of the
network is particular to the particular network application.

To achieve a particular network application, a fault set F
must be defined, the waiting time interval for messages (i.e.
T^wait1 and T^wait2) must be defined, the format of messages
must be defined, the response function R must be defined
and the mode of message passing must be defined. The node
can remove messages it receives from the message stream or
pass all messages in the message stream.

The response function R is responsible for achieving the
data transmission and security goals of the network. R is a
function whose domain is the cross product of the set of
node statuses and the set of message lists and whose range
is the set of message lists. This may be expressed in
mathematical form as:

R: Status × MessageList → MessageList

in which:

Status corresponds to the set of node statuses;
MessageList corresponds to the set of message lists (0 or
more messages in whatever format is being used);
× denotes a mathematical cross product; and
→ denotes a mapping to a result space.

The response function R is preferably selected to be very
fast to compute in order to provide a relatively rapid
technique. Ideally, the response function R should be provided
from one or more compare operations and table
lookup operations performed within a network node. With
this approach, any delay in identifying faults and attacks is
relatively short and the network provides minimal data loss.

As mentioned above, messages can move upstream or
downstream in the network. The response function receives
all the messages at a node as input. It processes these
messages to generate the messages for transmission from the
node. The response function generates messages which the
node transmits up- and down-stream. As will be discussed
below, the response function R can be defined to handle a
variety of different network recovery applications including
but not limited to loopback recovery and automatic protection
switching (APS) recovery.

In addition, the response function R may have a side effect
response, such as raising an alarm or re-routing traffic at a
node.

Each node, i, in a network can have a different response
function denoted as R_i. The use of different response
functions, with varying processing times, may, however,
result in race conditions in the network. In general, timing
problems due to different response functions can be avoided
by forcing all response functions in a network to operate in
the same amount of time. Thus in one approach, the processing
time is set to be the maximum time required by any
of the response functions. Moreover, a wait time can be




added to each response function such that its final processing
time is equal to the maximum time. It should be noted that
the response function R may return no message, or the
empty set, in which case no messages are transmitted.

With reference to FIG. 7A, the problem of basic attack
localization discussed above in conjunction with FIG. 6 is
re-considered. Recall that, for this problem, the nodes have
two fault types: no fault and fault (i.e. the fault set F includes
a value of 1 denoting no fault at the node and a value of zero
denoting fault at the node), the status S of a node i is denoted
S(i) and the status must be set to a value in the fault set F,
and messages from any node encode the status of the
node. The goal for node i is to determine whether it is the
source of the attack or if the attack is being carried by the
data from a node upstream. Each node in the network repeats
the processing steps shown in FIG. 7A.

In the general technique, the waiting times are set as
follows: T^wait1 = 0 and T^wait2 = max_i(T_i^meas). Also, the
message passing parameter is set to remove all messages
received. The response function R may be expressed as
described below in conjunction with FIG. 7A. In FIG. 7A,
it is assumed that the node inputs are the node status s and
all messages received within a predetermined time interval
(denoted as InMessages).

Turning now to FIG. 7A, processing begins in Step 74
where the node i generates a node status and receives all
messages which arrive at the node in a pre-determined time
interval (denoted as InMessages). If a fault is recognized,
then the node status will reflect this. Processing then flows
to decision block 76 where a decision is made as to whether
this is the response based on this node's status only or the
status of this node together with messages received from the
upstream node in a predetermined period of time.

If processing this node status only, then processing flows
to step 77 where the node status is returned and the processing
ends. If processing received messages, then processing
flows to decision block 78 where a decision is made as
to whether the node status is a fault or a no fault status (i.e.,
no fault equals 1, fault equals 0).

the source of an attack; (2) if node j has detected an attack
and node i has detected no attack, then node j concludes that
it is the source of the attack; (3) if node j has detected an
attack and node i has detected an attack, then node j
concludes that it is not the source of the attack; and (4) if
node j has detected an attack and has not received any
messages from node i at time t > max(T_i^meas, T_j^meas) + T_ij,
then node j concludes that it is the source of the attack.

It should be noted that node j completes processing at a
time corresponding to T_ij + max(T_i^meas, T_j^meas) + T_j^proc.
An exhaustive enumeration of the possible timing constraints
involving T_i^meas, T_j^meas, T_j^proc, T_ij and a length of
the attack L shows that, owing to the technique of the present
invention, node j is never in the wrong state, where the state
is given with a delay, i.e. the node concludes at time
t + max(T_i^meas, T_j^meas) + T_j^proc that it is the source of
an attack if and only if it is the source of an attack at time t.

FIGS. 8 and 9 illustrate a scenario in the processing to
localize an attack when the propagating attack does not
disrupt all nodes.

Referring now to FIG. 8, a portion of a network which
illustrates a scenario in which the technique described in
FIG. 9 may be used is shown. In this scenario, an attack is
carried by a signal but the attack may not be detectable in
some nodes. In this particular embodiment, consideration is
given to a specific attack scenario due to crosstalk. This
scenario should be distinguished from a scenario in which it
is assumed that, in the case of an attack which is carried by
the signal, all the nodes through which the signal is transmitted
will be affected by the attack, i.e. they will suffer a
failure. In the case where all nodes are affected by the attack,
the basic attack localization technique described in connection
with FIG. 7A can localize the source of such attacks.

In the scenario where the attack is not detectable in some
nodes, as the signal traverses down the network it attacks
some nodes then reaches a node which it does not attack and
propagates through the node to attack downstream nodes.
For example, turning now to FIG. 8 consider an attack of
channel 86a at node 84, a switch, in the network nodes of

If in decision block 78 the node status received in step 74 40 FIG. 8 Because of the finite isolation characteristics between 

is not a fault status, then processing flows to decision block two channels propagating through the switch 84 and the 

79 where a decision is made as to whether a message result and crosstalk at node 84, the output of channel 866 at 

received from an upstream node j at the node i is an alarm node 84 is affected by the attack. The signal in channel 866 

message. If a decision is made that this is an alarm message, men propagates to node 90, which is an amplifier. Since this 

then processing flows to Step 82 where the node returns a 45 signal is the only input to the node 90, gain competition is 

node status value of 1. If decision is made that the message not possible so the node 90 does not detect an attack. At node 

received from the node j is not an alarm message, then 92, however, channel 86c is once again affected by crosstalk 

processing flows to Step 80 where the node returns a value from the attack, thus an alarm is generated. The attack does 

of 0 and processing ends. propagate. It is detected in nodes 84 and 92, but it is not 

In localizing the attack, it is useful to look at the dynamics 50 detected at intermediate node 90. 
between two nodes and the connection between them. Each It is thus desirable to apply the attack localization tech- 
node monitors every connection into it. In one relatively nique of the present invention to this problem of not all 
simple example, a connection between nodes i and j, with nodes detecting an attack. To isolate the salient issues, the 
the data flowing from i to j is examined. simplest framework within which this problem can occur is 

Defining the time at which the data leaves node i as time 55 considered. Nodes 84, 90, 92 have two fault types. The first 

t«0, the message from node i to node j about node i's failure fault type is no fault (i.e. F=l) and the second fault type is 

is sent at time T/""*". Node j receives the data at time T (> , and fault (i.e., F=0). The message simply contains a status: fault 

completes the measurement and sends its status at time or no fault. The goal of the technique is unchanged, node 84 

T^+Ty""*. At this time node j has detected an attack or it has must determine whether it is the source of the attack or if the 

detected no attack. Node j receives the message from node 60 attack is being carried by the data from a source upstream, 

i at time T l> +T i ' nea *. Thus, node j can begin to process the The difference between this problem and the basic attack 

status message from i at time T^+max (Ty"*", T/"*" 5 ). At localization problem is that each node 84, 90, 92 must know 

this time node j has information indicating whether or not of the status at all the nodes upstream from it in the network, 

node i detected an attack, and node j has enough information whereas in the basic attack localization problem it is 

to determine whether or not it is the source of the attack. 65 assumed that when an attack propagates, every node in the 

Processing at node j falls into one of four cases: (1) if node network detects a fault so the status from the single preced- 

j has a detected no attack, then node j concludes that it is not ing node contains sufficient information from which to draw 
conclusions. Instead of generating messages at each node, the
data is followed from its inception by a status message which
lags the data by a known delay. The status message is posted by
the node at which the communication starts. Once an attack is
detected the status message is disabled. The lack of a status
message indicates to all the nodes downstream that the source of
the attack is upstream of them. Note that such a status message
is akin to a pilot tone which indicates that an attack or a fault
has occurred.

With the above scenario in mind, one can define the response
function R for selective attack localization, as expressed below
and in conjunction with FIG. 9. It should be noted that the
processing of FIG. 9 assumes that inputs to each node are the
status S and all messages received within a predetermined period
of time.

Before describing the processing steps, it should be noted that
the nodes in the network never generate messages. They can,
however, disable the status message when they detect an alarm.
When the status message is disabled, any node downstream can
conclude that it is not the origin of the attack.

In the general technique the waiting times T^{wait1} and
T^{wait2} are set as T^{wait1}=0 and T^{wait2}=max_i(T_i^{meas})
and the message passing mode is to transmit all messages.

Referring now to FIG. 9, the response function for selective
attack localization is shown. Processing begins in Step 94 where
data at the source node of the communication is generated for
transmission to the one or more destination nodes. Processing
then flows to Steps 96 and 98 where the data is first transmitted
to the next nodes and then a status message is transmitted to the
next nodes. It should be noted that the status message lags the
data message by a predetermined amount of time.

Processing then flows to step 100 where the data is received at
the nodes. Immediately upon receipt of the data, the node can
begin processing the data and can conclude that an attack
occurred prior to processing step 102 where the messages are
received at the nodes. It should be noted that the messages have
a delay which is smaller than T^{meas}.

Processing then proceeds to decision block 103 where a decision
is made as to whether an attack has been detected at two nodes
(i.e., the processing node and some other node). If a decision is
made that two nodes have not detected an attack, then processing
proceeds to processing step 106 where the node is determined to
not be the source of the attack. Processing then flows to
decision block 107 which will be described below. If, on the
other hand, in decision block 103 a decision is made that an
attack has been detected at two nodes, then processing flows to
decision block 104.

In decision block 104 it is determined if the status message is
enabled. If a node determines there is an attack, the node
disables the message. If the status message has been disabled,
then processing flows to processing block 106. If, on the other
hand, the decision is made that the status message is enabled,
then processing flows to step 105 where the status message is
disabled, thus indicating that this node is the source of the
attack.

Processing then flows to decision block 107 where a decision is
made as to whether this node is the destination node. If the node
is not the destination node, then the data and the status message
(if not disabled) are transmitted to the next nodes as shown in
processing blocks 108 and 110 and processing returns to Step 100.
Steps 100 through 110 are repeated until the destination node
receives data and the message. If the node is the destination
node, then processing ends.

Suppose a node i is attacked at time t. The node turns off the
status message at time t+T_i^{meas}+T^{proc}. The next node,
e.g. node j, receives the data stream at time t+T_{ij} and waits
until time t+T_{ij}+T^{wait2}. For an all-optical network,
switching off a channel can be done in the order of nanoseconds
with an acousto-optical switch. The delay between nodes in the
network would typically be larger, and thus it is not believed
this condition will be problematic in practice. Moreover, the
network can be designed to ensure this condition is met by
introducing delay at the nodes. Such a delay is easily obtained
by circulating the data stream through a length of fiber.

Response to multiple fault types can be handled efficiently with
a lookup table. In the case of multiple fault types, the response
function R would have a pre-stored table L. Given the current
node status, s_i, and the status of the previous node, s_j, the
lookup table provides the appropriate response for this node, r_i
(i.e., L: Status of node i x Status of node j -> Response). For
some applications it is useful to have different lookup tables
for the next node in the network and the previous node in the
network, L_p. Furthermore, the look-up tables can be extended to
the domain of Status x Response, which gives greater flexibility.

FIGS. 10 and 11 illustrate the processing required to repress
alerts and to reduce the occurrence of alarm recovery. Alarm
recovery refers to the steps required to route around a node or
physically send a person to fix the node, typically using manual
techniques such as physical repair or replacement of circuit
components. Thus it is very expensive to perform alarm recovery.

Consider a node which detects signal degradation. The signal may
be amplified sufficiently by the next node downstream to remain
valid when it reaches the destination. Since a valid signal
reaches the destination node, it may thus be undesirable to stop
transmitting the signal or to re-route the node that detected
this problem. Instead, it may be preferable to continue network
operation as usual and generate an alert signal, but not an alarm
signal. There thus exist three possible response values: (1) node
status value is no fault or O.K. (e.g. S=1); (2) node status
value is fault or alarm (e.g. S=0); and (3) node status value is
alert. Thus, the source of an attack which is not corrected
generates an alarm signal or alarm node status value whereas the
source of a corrected attack generates an alert signal or alert
node status value.

The attack localization technique of the present invention can
achieve this behavior using upstream messages. Each node must
send status messages upstream as well as downstream. Upon
detecting an attack in a node, downstream messages are checked to
determine if this node is the source of the attack. Upstream
messages are checked to determine if the attack persists in the
next node downstream. When a node detects an attack it first
generates an alarm. If it later finds that the problem was
corrected downstream it downgrades its alarm to an alert.

The response function for a network operating in accordance with
the above concepts is described in conjunction with FIGS. 10 and
11. FIGS. 10 and 11 illustrate the processing which takes place
at first and second nodes in a network. The second node (node 2)
is downstream from the first node (node 1). FIG. 10 illustrates
the processing which takes place at node 2 and FIG. 11
illustrates the processing which takes place at node 1. Each of
the nodes repeats the respective processing steps described in
conjunction with FIGS. 10 and 11.

Referring now to FIG. 10, processing begins in Step 112 where a
node status is computed and proceeds to Step 114 where the node
status is transmitted to upstream and downstream nodes. The
processing then ends.

In FIG. 11, processing begins in processing block 120 where a
node status is computed. Processing then flows
to decision block 122 which determines if the node status is an
alarm status. If in decision block 122 it is determined that the
node status is not an alarm or a fault status, then processing
ends. If, on the other hand, in decision block 122 it is
determined that the node status is an alarm or a fault status,
then processing flows to step 124 where the node receives
messages from a downstream node (e.g. a node 2) within a
predetermined period of time. Processing then flows to decision
block 126 where a decision is made as to whether a message from
the second node (node 2) is an alarm message. If the message is
not an alarm message, then processing flows to Step 130 where the
response for node 1 is indicated to be an alert signal or an
alert node status value and processing then ends. If in decision
block 126 a decision is made that the message from node 2 is an
alarm, then processing flows to Step 128 where the response for
node 1 is an alarm signal or an alarm node status and processing
then ends.

Upstream messages may follow the data stream by a significantly
longer time than do downstream messages. An upstream message
requires time for the data to traverse a link (i, j) from node i
to the next node, j. The status of node j must be measured, and
the message from node j to node i must traverse the link (i, j).
Therefore the waiting time in the attack localization technique
is longer when upstream messages are monitored. In particular,
for this scenario the value of the waiting time is preferably set
to a value which takes into account such factors, such as
T^{wait2}=2*max(T_{ij})+max_i(T_i^{proc}).

FIGS. 12-14 illustrate how the techniques of the present
invention can be used for network service restoration after an
attack for two important types of preplanned recovery schemes:
(1) automatic path protection switching (APS) and (2) loopback
protection. These two preplanned recovery schemes are the two
types of network restoration used for SONET/SDH. For each type of
network restoration, a description of how the technique can be
used to perform recovery, together with the process steps that
achieve the attack localization, is provided.

APS allows the network to receive data on the backup stream in
the event of a faulty node. In the case of an attack, service
would be maintained if the attack is detected. The location of
the attack, however, is unknown and restoring normal network
operation may require a great deal of time or be erroneous as
discussed earlier.

The attack localization technique described above in conjunction
with FIG. 7A provides the network the required information to
switch streams upon an attack or a fault. Furthermore, the
attacked node is ascertained so that the attack can be dealt with
quickly. The basic fault localization technique can be used to
determine whether an attack took place along the primary path.

FIG. 12 shows a network 132 having a plurality of nodes 134
including a source node 134a and a destination node 134b. Source
and destination nodes 134a, 134b are in communication via a
primary path 136 provided from links 136a-136d and a backup path
138 provided from links 138a-138d.

If an attack took place along the primary path 136, there will be
a message indicating the presence of such an attack and lagging
the attack by a time T^{meas} traveling alongside the primary
path. The destination node will therefore know that there was an
attack upstream and that the destination node 134b was not the
source of the attack. The response of the destination node 134b
will be to listen to the backup path or stream 138.

This network requires a first response function for destination
nodes and a second response function for all other nodes denoted
as R_n. The response function R_n can be set to the response
function described above in conjunction with FIG. 7A. One purpose
of the destination node response function is to determine if the
node receives any alarm messages. If the destination node does
not receive any alarm messages then the node may optionally
transmit a node status message s. If the status of the node is an
alarm message, then the destination node performs any necessary
processes to receive data on a backup data stream.

Since the attack localization technique relies on messages
arriving at nodes at specific times, a problem may arise if the
two different response functions do not obey these timing
conditions. Since all nodes in the network except the destination
nodes use R_n, the timing up to the destination node will not
result in a race condition. Since no node is waiting for messages
from the destination nodes, any differences in time will not
affect nodes in the network.

Switching routes can entail certain routing problems. Such
problems can be avoided by delaying transmission on the backup
path. The necessary delay on the backup path may be computed as
follows. Denote by T^{switch} the time it takes for the
destination node 134b to switch from the primary path 136 to the
backup path 138 after an alarm on the primary path 136 has been
diagnosed by the destination node 134b. Let ΔT represent the
difference in transmission delay between the source node 134a and
the destination node 134b between the primary stream 136 and
backup stream 138. It is assumed that the transmission delay is
shorter on the primary path 136 and longer on the backup path
138. Regardless of where the failure happened on the primary path
136 no data will be lost in the process of detecting the problem
and switching to the backup stream 138 as long as the data on the
backup stream is transmitted with a delay of at least

    max_{nodes i in the path}(T_i^{meas}) + T^{switch} - ΔT.

Those of ordinary skill in the art will appreciate of course that
in some embodiments, the transmission delay is longer on the
primary path 136 and shorter on the backup path 138 and that
appropriate changes in processing may be necessary owing to such
a condition.

If all nodes 134 have the same T_i^{meas}, then no matter where
the failure occurs in the primary path 136, there is always the
same delay between the data stream flowing on the primary path
136 and the data stream flowing on the backup data path 138 after
APS. Therefore, the destination node 134b need not adapt its
response to the location of the failure. Independence from the
location of the failure is very advantageous for scalability of
the network. Moreover, having a single delay for all nodes
results in simple optical hardware at the destination node 134b
since adapting to different delays on the fly requires
significant complexity at the destination node.

Referring now to FIGS. 13 and 13A, response function processing
in accordance with the techniques of the present invention to
provide loop-back restoration in the case of a failure is shown.
It should be noted that loop-back restoration in the case of a
failure is performed by the two nodes adjacent to the failure.

Processing begins in decision block 140 where a decision is made
as to whether a node has received an incoming message. If the
node has not received an incoming message, then processing
proceeds to Step 142 where the node posts a status message which
indicates that the node has no information concerning the
identity of a source attack node. The status message may include,
for example, a don't-know flag. Processing then ends.
If in decision block 140 a decision is made that the node has
received at least one incoming message, then processing
proceeds to Step 144 where each of the at least one incoming
messages to be sent to both upstream and downstream nodes
are shown. Processing then flows to decision block 146
where a decision is made as to whether the status included in
the message is an attack status. If a decision is made that the
status is not an attack status, then processing flows to
decision block 148 where a decision is made as to whether the
downstream node detected an attack.

If the downstream node did not detect an attack, then
processing ends. If, on the other hand, in decision block 148,
a decision is made that the downstream node detected an
attack, then processing flows to processing block 164 (FIG.
13A) where the node immediately upstream of the attack
node is re-routed when it receives a message having an
indicator with a value which indicates that the node is under
attack (e.g. the node receives an attack flag message). Thus,
the response processor causes a re-routing of the node
immediately upstream of the attacked node when it receives
an {attack, flag} message. The upstream nodes need not wait
for an {attack, mine} message because the attack does not
propagate upstream. Processing then ends.

If in decision block 146 a decision is made that the status is
an attack, then processing flows to decision block 150 where
a decision is made as to whether the node upstream detected
an attack. If the node upstream did not detect an attack,
processing flows to block 152 where the node posts
a status message which indicates that the node has information
concerning the identity of a source attack node. The
status message may include, for example, a mine flag
meaning that this node is the source of the attack. If, on the
other hand, a decision is made that the upstream node
detected an attack, then processing flows to step 154 where
the node posts a status message which indicates that the node
has information concerning the identity of a source attack
node. The status message may include, for example, a
not-mine flag meaning that this node is not the source of the
attack.

Processing then flows to decision block 156 where a decision
is made as to whether the node upstream is the source
of the attack. If a decision is made that the node upstream is
not the source of the attack, then processing ends. If, on the
other hand, a decision is made that the upstream node is the
source of the attack, then processing flows to step 158 where
the node immediately downstream of the attack node is re-routed.
Thus, when an {Attack, Mine} message is received from the
source node the response processor causes a re-routing of
the node immediately downstream of the attacked node.
Processing then flows to step 160 where the node posts a
not-mine flag and then to step 162 where a status message
is transmitted with whatever flag has been posted. Processing
then ends.
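
The flow of FIGS. 13 and 13A described above can be sketched for a single node as follows. This is a minimal illustration rather than the claimed implementation: the names `Message` and `respond` are hypothetical, a missing upstream message after the waiting time is treated as "upstream did not detect an attack", and the flag posted by a healthy node (DontKnow, or NotMine when it re-routes) is an assumption read from Table 1, since the flow description does not state it.

```python
from typing import NamedTuple, Optional, Tuple

OK, ATTACK = "O.K.", "Attack"
DONT_KNOW, MINE, NOT_MINE = "DontKnow", "Mine", "NotMine"


class Message(NamedTuple):
    status: str  # OK or ATTACK
    flag: str    # DONT_KNOW, MINE or NOT_MINE


def respond(status: str,
            upstream: Optional[Message],
            downstream: Optional[Message]) -> Tuple[Message, bool]:
    """Return (message to post, re-route decision) for one node."""
    # Step 142: no incoming message, so the node cannot judge the source.
    if upstream is None and downstream is None:
        return Message(status, DONT_KNOW), False
    if status != ATTACK:
        # Blocks 146/148/164: a healthy node re-routes only when the node
        # immediately downstream reports an attack (assumed flag: NotMine).
        reroute = downstream is not None and downstream.status == ATTACK
        return Message(status, NOT_MINE if reroute else DONT_KNOW), reroute
    # Blocks 150/152: attacked and upstream is clean, so this node is the source.
    if upstream is None or upstream.status != ATTACK:
        return Message(status, MINE), False
    # Blocks 154-160: not the source; re-route only if the upstream node is.
    return Message(status, NOT_MINE), upstream.flag == MINE


# The source node: attacked, upstream neighbour reports O.K.
print(respond(ATTACK, Message(OK, DONT_KNOW), Message(OK, DONT_KNOW)))
```

Returning the re-route decision alongside the posted message keeps the function free of side effects, so the same sketch can be evaluated for every node of a ring at each time step.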

Referring now to FIG. 14, loopback restoration, in the
case of a failure, is performed by the two nodes adjacent to
the failure. A network 170 includes a plurality of network
nodes 172a-172g with node 172f corresponding to a source
node and node 172d corresponding to a destination node. A
data stream flows between the nodes on a primary stream or
channel 174. If node 172a experiences a failure, the primary
data stream 174 is re-routed at node 172g to travel on a
backup channel 176. Simultaneously, the node 172b receives
information on the backup stream 176. This restoration
maintains the connectivity of the ring network 170 and
allows the data to reach the intended destination despite the
failure at node 172a.

Consider an attack at node 172a, as shown in FIG. 14.
Node 172g is immediately upstream of node 172a, which is
the source of the attack. Node 172b is immediately downstream
of node 172a. The attack may spread so that each of
nodes 172b-172d will detect an attack while it will not be
attacked directly. Each of these detected attacks will cause
loopback recovery, resulting in multiple loopbacks and no
data transmission. Thus, for these and other reasons discussed
above, detection of attacks as failures and utilization
of a conventional loopback operation might not offer recovery
from attack.

The technique of the present invention is applied to
loopback in the following way. In the event of an attack each
node 172a-172g attempts to determine whether it is immediately
upstream or immediately downstream of the attacked
node. In the network 170, the node 172g finds that the node
172a is the source of an attack (by monitoring upstream
messages) and re-routes. The node 172b also finds that the
node 172a is the source of the attack and re-routes. All other
nodes 172c-172f find that they are not immediately
upstream or downstream of the attack. Thus, the nodes
172c-172f do not re-route despite the detected attack.
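
The outcome described above, in which only the two ring neighbours of the attacked node re-route, can be illustrated with a short sketch. The function name `rerouting_nodes` and the exact ordering of the ring (nodes listed in the direction of data flow, with 172e between 172d and 172f) are assumptions made for illustration.

```python
from typing import List, Tuple


def rerouting_nodes(ring: List[str], source: str) -> Tuple[str, str]:
    """Return (upstream neighbour, downstream neighbour) of the attacked node.

    `ring` lists the node labels in the direction of data flow.
    """
    i = ring.index(source)
    upstream = ring[i - 1]                   # index -1 wraps to the last node
    downstream = ring[(i + 1) % len(ring)]   # wrap around the end of the ring
    return upstream, downstream


ring = ["172a", "172b", "172c", "172d", "172e", "172f", "172g"]
print(rerouting_nodes(ring, "172a"))  # ('172g', '172b')
```

In the technique itself no node has this global view; each node reaches the same conclusion locally from the messages of its neighbours.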

In the attack localization technique the wait time T^{wait2}
can be set to T^{wait2}=2*max(T_{ij})+max_i(T_i^{proc}), which
gives the node time to monitor backward messages. Messages will
consist of the couple (s, flag) where s is the status of the node
(one of O.K. or Attack), and the status flag belongs to the set
of status flags {DontKnow, Mine, NotMine}. The status flags
indicate whether the transmitting node is responsible for the
fault or not, or that the node does not yet know if it is
responsible for the fault. For this case we will remove
messages from the message stream when they are processed.
The response function R is as discussed above in conjunction
with FIGS. 13-13A.
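
The waiting-time rule stated above can be written as a small helper. This is a sketch only: the function name `wait_time_2`, the dictionary layout, and the numeric delays (in arbitrary time units) are hypothetical.

```python
from typing import Dict, Tuple


def wait_time_2(link_delays: Dict[Tuple[str, str], float],
                proc_times: Dict[str, float]) -> float:
    """T_wait2 = 2 * max over links of T_ij + max over nodes of T_i_proc."""
    return 2 * max(link_delays.values()) + max(proc_times.values())


# Hypothetical delays for a three-node segment i -> j -> k.
links = {("i", "j"): 5.0, ("j", "k"): 3.0}
procs = {"i": 0.5, "j": 0.25, "k": 0.5}
print(wait_time_2(links, procs))  # 10.5
```

The factor of two reflects the round trip described earlier: the data traverses a link to the next node, and that node's status message must traverse the same link back.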

Table 1 illustrates the messages posted at nodes 172a
(node j), 172b (node k), 172g (node i), and 172c (node l),
when an attack occurs at the node 172a. For simplicity it is
assumed that all measurement times are negligibly small and all
transmission times are equal. This allows examination of the
nodes at discrete time steps.

Let the attack at node 172a occur at time t. At this time
only node 172a detects an attack. At time t+1 node 172a
receives an O.K. message from node 172g and finds it is the
source. Node 172b detects an attack and receives an attack
message from node 172a, indicating that node 172b is not the
source of the attack. Node 172g receives the attack message from
node 172a and re-routes. At time t+2 node 172b finds that
the node 172a is the source of the attack and that the node 172b
is the next node downstream, and re-routes. The node 172c
also detects the attack at time t+2 and receives the {Attack,
NotMine} message from the node 172b, thereby finding that
it is not the source. Since the message indicates that the node
172b is not the source, the node 172c does not re-route.
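
The timeline above can be reproduced with a small discrete-time simulation. It is a sketch under stated assumptions, not the patented procedure: the function name `simulate` is hypothetical, the attack is assumed to spread one hop downstream per time step, every message is assumed to take exactly one step to reach a neighbour, and a re-route decision is assumed to persist once taken.

```python
OK, ATTACK = "O.K.", "Attack"


def simulate(nodes, source, steps):
    """Return, per time step, a list of (status, flag, rerouted) per node."""
    n = len(nodes)
    src = nodes.index(source)
    prev = None                    # messages posted at the previous step
    rerouted = [False] * n
    rows = []
    for step in range(steps):
        row = []
        for k in range(n):
            hops = k - src         # hops downstream of the attack source
            status = ATTACK if 0 <= hops <= step else OK
            if prev is None:
                flag = "DontKnow"  # nothing has been received yet
            else:
                up = prev[k - 1] if k > 0 else (OK, "DontKnow")
                down = prev[k + 1] if k < n - 1 else (OK, "DontKnow")
                if status == ATTACK:
                    flag = "NotMine" if up[0] == ATTACK else "Mine"
                    if up == (ATTACK, "Mine"):
                        rerouted[k] = True  # immediately downstream of source
                elif down[0] == ATTACK:
                    flag = "NotMine"
                    rerouted[k] = True      # immediately upstream of the attack
                else:
                    flag = "DontKnow"
            row.append((status, flag))
        prev = row
        rows.append([(s, f, r) for (s, f), r in zip(row, rerouted)])
    return rows


for t, row in enumerate(simulate(["172g", "172a", "172b", "172c"], "172a", 4)):
    print(f"t+{t}", row)
```

With these assumptions node 172g re-routes from t+1, node 172b from t+2, and node 172c never re-routes, which matches the narrative above.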

The timing issues are important, since the nodes 172b,
172g act independently. If the node 172b performs loopback
before the traffic on the backup channel has reached the node
172b, there will simply be a delay in restoration but no data
will be lost. Loopback could fail, however, if the node 172b
performs loopback after the loopback traffic from the node
172g has arrived at the node 172b. There could be loss of
data on the backup channel upon arrival at the node 172b. It
can be shown, however, that this eventuality cannot occur.

Let t be the time at which the attack hits the node 172a.
At time t+max(T_i^{meas}, T_j^{meas})+T_j^{proc}, the node 172a
will send a message to the node 172g informing it that the source
of the attack is at the node 172a. The node 172g will receive
the message that the node 172a is the source of the attack at
time t+max(T_i^{meas}, T_j^{meas})+T_j^{proc}+T_{ij} and will
finish processing T_i^{proc} later. If it takes the node 172g a
time period corresponding to about T_i^{loop} to perform
loopback, then the node 172g will perform loopback at time
t+max(T_i^{meas}, T_j^{meas})+T_j^{proc}+T_i^{proc}+T_{ij}+T_i^{loop}.
The node 172b will know that it is not the source of
the attack at time t+max(T_j^{meas}, T_k^{meas})+T_k^{proc}+T_{jk}.
However, the information that is needed by the node 172b is
whether or not the node 172a is the source of the attack.

The node 172b will know that the node 172a is the source
of the attack and perform loopback at time
t+max(T_i^{meas}, T_j^{meas})+T_j^{proc}+T_{jk}+T_k^{loop}. It
may be assumed that all the times T^{meas} are equal, all of the
T^{loop} are equal and all of the T^{proc} are equal. Such an
assumption may be made without loss of generality because one
could take the maximum of all these time periods and delay the
others to match the maximum. One can assume, as would be the case
in AONs, that transmission delays are proportional to length.
From elementary geometry, it is known that |T_{ij}-T_{jk}| is
less than or equal to the transmission time from the node 172g to
the node 172b. Therefore, no traffic from the node 172g to the
node 172b placed on the backup channel by loopback will
arrive at the node 172b before the node 172b has performed
loopback.
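
Under the equal-time assumption, the argument can be written out as a short derivation. The shorthand below is introduced here for illustration only: M, P and L denote the common measurement, processing and loopback times, T_{ik} denotes the transmission time from node 172g to node 172b, and t_g and t_b denote the loopback instants of nodes 172g and 172b as read from the expressions above.

```latex
\begin{align*}
t_g &= t + M + 2P + T_{ij} + L, \\
t_b &= t + M + P + T_{jk} + L, \\
t_g + T_{ik} &\ge t_g + \lvert T_{ij} - T_{jk} \rvert
              \ge t_g + T_{jk} - T_{ij} \\
             &= t + M + 2P + T_{jk} + L
              = t_b + P \ge t_b .
\end{align*}
```

The looped-back traffic from node 172g therefore cannot reach node 172b before node 172b has itself performed loopback.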
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processing messages received in a message processing 
one of the first and second nodes to determine if the 
attack was passed to the message processing node from 
another node or if the message processing node is a first 
node to sustain an attack on a certain channel. 

2. The method of claim 1 wherein prior to the step of 
transmitting the one or more messages, the method com- 
prises the step of transmitting data between said first and 
second nodes. 

3. The method of claim 2 wherein the localization of an 
attack at the message transmitting node requires a predeter- 
mined processing time at wherein the predetermined pro- 
cessing time includes: 

a detection time to detect the input and output signals and 
processing of the results of that detection, and 

a time delay associated with generating one or more 
messages for transmission to at least one of an 
upstream and a downstream node. 

4. The method of claim 3 wherein said detection time 
comprises: 

a time for capturing one or more messages from upstream 
and/or downstream nodes; and 



TABLE 1 



time 


node 172g 


node 172a 


node 172b 


node 172c 


t 

t + 1 
t + 2 
t + 3 


(OJCJDontXnow) (Attack.DontKnow) 
(OJL,NotMme)* (Attadc^ine) 
(O.K.,NotMine)* (Atteck,Mine) 
(OJL^NotMine)* (AUack,Mine) 


(O.K.,DontKnow) 
(Attack,NotMine) 
(Attack, NotMine)* 
(Attack,NotMine)* 


(OJC,DontKnow) 
(O.JC,DontKnow) 
(Attftck^fotMinc) 
(Attack^NotMinc) 



Table 1 shows messages and the side-effects of operating 
the response function R in conjunction with the processes 
and techniques described above in conjunction with FIGS. 
13, 13A when an attack occurs at time t at the node 172a. A 35 
decision to re-route is indicated with an asterisk (*). 

As indicated heretofore, aspects of this invention pertain
to specific "method functions" implementable on computer
systems. Those skilled in the art should readily appreciate
that programs defining these functions can be delivered to a
computer in many forms, including, but not limited to: (a)
information permanently stored on non-writable storage
media (e.g., read only memory devices within a computer or
CD-ROM disks readable by a computer I/O attachment); (b)
information alterably stored on writable storage media (e.g.,
floppy disks and hard drives); or (c) information conveyed
to a computer through communication media such as telephone
networks. It should be understood, therefore, that
such media, when carrying such information, represent
alternate embodiments of the present invention.

Having described preferred embodiments of the
invention, it will now become apparent to one of ordinary
skill in the art that other embodiments incorporating their
concepts may be used. It is felt therefore that these embodiments
should not be limited to disclosed embodiments, but
rather should be limited only by the spirit and scope of the
appended claims.

What is claimed is: 

1. A method for performing attack localization in a
network having a plurality of nodes, the method comprising
the steps of:

determining, at each of the plurality of nodes in the
network, if there is an attack on the node;

transmitting one or more messages between first and
second nodes wherein said first node is upstream from
said second node and wherein each of the one or more
messages indicates that the node transmitting the message
detected an attack at the message transmitting
node; and



a time to process the captured messages together with 
local information. 

5. The method of claim 4 wherein: 

the first node is upstream of the second node on a first
channel which is an attacking channel;

both the first and second nodes identify the attacking
channel;

wherein the first node transmits to a second node a finding 
that the channel is nefarious and the interval between 
the time when the attack hits the second node and the 
second node receives a message from the first node that 
the attack also hit the first node is not greater than a first 
predetermined period of time; and 

wherein the localization of the attack commences at the 
second node as soon as the attack reaches the second 
node and the elapsed time until the second node iden- 
tifies the attack and determines whether the first node 
also detected that attack is not greater than a second 
predetermined period of time. 

6. A method for processing information in a node of a 
communication network comprising the steps of: 

(a) computing a node status S of a node N1 at a time T;

(b) transmitting a message including the node status 
information S to nodes downstream; 

(c) determining if the status information indicates an 
alarm status for the node; 

(d) in response to the status information not indicating an 
alarm status for the node, ending processing; 

(e) in response to the status information indicating an 
alarm status at the node performing the steps of: 

(1) determining if any alarm messages arrive at the 
node within a predetermined time interval; 

(2) in response to no alarm messages arriving at the 
node within the predetermined time interval, setting 
the node status of the node to alarm; and 
(3) in response to at least one alarm message arriving
at the node within the predetermined time interval 
setting the node status of the node to no alarm. 

7. A method for processing information in a node of a
communications network comprising the steps of:

(a) computing a status of a node at a first time;

(b) transmitting the one or more messages including the
node status on arcs leaving the node;

(c) collecting all messages arriving at the node within a
predetermined time interval;

(d) computing at least one response to be included in at
least one message wherein each of the at least one
responses depends upon a node status of the node and
information contained in the messages which arrived at
the node within the predetermined time interval; and

(e) transmitting at least one message including one of the
at least one responses on arcs leaving the node.

8. The method of claim 7, wherein the step of transmitting
messages on arcs leaving the node includes the step of
transmitting messages on all arcs leaving the node.

9. The method of claim 1 wherein said message process- 
ing step comprises the steps of: 

(a) determining a status of the message processing node;
and

(b) concluding that said attack was passed to the message 
processing node if said status indicates an alarm and at 
least one message was received within a predetermined 
period of time by the message processing node from the 
message transmitting node indicating an alarm at the 
message transmitting node. 

10. The method of claim 9 wherein said message pro- 
cessing step further comprises the step of: 

(c) concluding the message processing node was the first
node to sustain an attack on said channel if said status 
indicates an alarm and the message processing node 
does not receive a message indicating an alarm status at 
another node within the predetermined period of time. 

11. Apparatus provided at each of a plurality of nodes of
a network for identifying the location of a fault in the
network, said apparatus comprising:

(a) a fault detector for detecting a fault at a respective
node and for providing a fault status signal indicating
whether or not said node has experienced a fault and for
transmitting said fault status signal to at least one other
node in said network; and

(b) a response processor responsive to said fault status
signal of the respective node and to a message received
by said node from another node in said network within
a predetermined time for updating the fault status signal
of the respective node.

12. The apparatus of claim 11 wherein said response
processor is operative to provide an updated fault status
signal indicating that the respective node is the source of the
fault if the fault status signal indicates detection of a fault
and either (a) the respective node does not receive a message
from another node in the network within the predetermined
period of time or (b) the respective node receives a message
from another node in the network within the predetermined
time indicating that the other node did not experience a fault.



13. The apparatus of claim 11 wherein said response 
processor is operative to provide an updated fault status 
signal indicating that the respective node is not the source of 
the fault if the fault status signal indicates detection of a fault 
and the respective node receives a message from another 
node in the network within the predetermined time indicating
that the other node experienced a fault.

14. A network comprising: 

a plurality of nodes, each one comprising: 

a fault detector for providing a fault status signal 
indicative of whether or not said node has experi- 
enced a fault; and 
a response processor responsive to said fault status 
signal and to one or more messages received at said 
node from other nodes in said network for determin- 
ing whether or not said fault originated at said node 
or whether said fault was propagated by the other 
node; and 

at least one channel interconnecting the plurality of nodes. 

15. A method for detecting the source of a fault in a 
network comprising a plurality of nodes, said method com- 
prising the steps of: 

(a) generating at a source node data for transmission to a 
destination node; 

(b) transmitting said data to a next node within said 
network; 

(c) transmitting a status message from the source node to 
said next node; 

(d) receiving said data at said next node; 

(e) receiving said status message at said next node; 

(f) said next node determining whether an attack has been 
detected at said next node and said source node; 

(g) determining whether a status message of said next 
node is enabled and: 

(1) if the status message is enabled, disabling said 
status message in order to provide an indication that 
the next node is the source of the attack and trans- 
mitting the data and status message to a further next 
node if the next node is not the destination node; and 

(2) if the status message is disabled, providing an indication
that the next node is not the source of the
attack.

16. A method for optimizing alarm recovery, comprising 
the steps of: 

computing the status of a node; 

determining whether the computed node status is an alarm 
status; 

receiving a message at the node from a downstream node
within a predetermined period of time;

determining whether said received message is an alarm
message;

providing an alarm signal if said received message is an 
alarm message and providing an alert signal if said 
received message is not an alarm message. 
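
The source-of-fault decision recited in claims 9, 10, 12 and 13 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed apparatus: the names `Verdict`, `StatusMessage`, `classify` and `window` are hypothetical, arrival times are measured from the moment the node computes its own status, and a node that receives both alarm and non-alarm messages within the window is assumed not to be the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    NO_FAULT = "no fault detected at this node"
    SOURCE = "this node is the source of the fault"
    NOT_SOURCE = "the fault was propagated from another node"


@dataclass(frozen=True)
class StatusMessage:
    sender: str     # identifier of the node that sent the message
    alarm: bool     # True if the sender reported a fault
    arrival: float  # time of arrival, measured from the local status computation


def classify(local_alarm: bool,
             messages: Iterable[StatusMessage],
             window: float) -> Verdict:
    """Decide whether a fault originated at this node.

    `window` stands in for the predetermined period of time; messages that
    arrive outside it are ignored.
    """
    if not local_alarm:
        return Verdict.NO_FAULT
    in_window = [m for m in messages if 0.0 <= m.arrival <= window]
    # An alarm from another node within the window means the fault was
    # passed to this node (claims 9 and 13).
    if any(m.alarm for m in in_window):
        return Verdict.NOT_SOURCE
    # No message at all, or only non-alarm messages, within the window
    # means this node was the first to sustain the fault (claims 10 and 12).
    return Verdict.SOURCE
```

For example, `classify(True, [StatusMessage("upstream", True, 0.5)], window=1.0)` yields `Verdict.NOT_SOURCE`, whereas the same message arriving at 1.5 falls outside the window and yields `Verdict.SOURCE`.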
Item [57], ABSTRACT,

Line 14, delete "Since the" and replace with -- The --.

Column 1,

Line 53, delete "owing" and replace with -- owing to the --.
Line 56, delete "signal ports" and replace with -- signal ports --.

Column 7,

Line 17, delete "network's" and replace with -- networks --.
Line 45, delete "diagnostic" and replace with -- diagnostics --.

Column 12,

Column 14,

Line 2, delete "to explicitly" and replace with -- to --.

Line 2, delete "account the" and replace with -- account explicity the --.

Line 12, delete "messages arrive" and replace with -- messages --.

Line 36, delete "being using" and replace with -- being used --.

Line 20, delete "expressed" and replace with -- expressed as --.

Line 24, delete "):" and replace with -- ). --.